Informative Data Projections: A Framework and Two Examples
Abstract
Projection Pursuit (PP) aims to facilitate visual exploration of high-dimensional data by identifying interesting low-dimensional projections. A major challenge in PP is the design of a projection index, a suitable quality measure to maximise. We introduce a strategy for tackling this problem based on quantifying the amount of information a projection conveys, given a user’s prior beliefs about the data. The resulting projection index is a subjective quantity, explicitly dependent on the intended user. As an illustration, we develop this principle for two kinds of prior beliefs: the first leads to PCA, and the second leads to a novel projection index, which we call t-PCA, that can be regarded as a robust PCA variant. We demonstrate t-PCA’s usefulness in comparative experiments against PCA and FastICA, a popular PP method.
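To make the notion of a projection index concrete, the following is a minimal sketch of the best-known case mentioned in the abstract: PCA, viewed as maximising the variance of the projected data. It is textbook PCA written with NumPy, not the paper's derivation, and it does not implement t-PCA, whose objective is not given here. The function names (`variance_index`, `pca_projection`, `random_projection`) and the synthetic data are assumptions made for illustration only.

```python
import numpy as np


def variance_index(X, W):
    """Projection index: total sample variance of X projected onto the columns of W."""
    Z = (X - X.mean(axis=0)) @ W
    return float(Z.var(axis=0, ddof=1).sum())


def pca_projection(X, k=2):
    """Return the k leading eigenvectors of the sample covariance (standard PCA)."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest
    return eigvecs[:, order]


def random_projection(d, k, rng):
    """Return a random d x k matrix with orthonormal columns, used as a baseline."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, k)))
    return Q


# Hypothetical synthetic data: 500 points in 5 dimensions with unequal scales.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])

W_pca = pca_projection(X, k=2)
W_rand = random_projection(5, 2, rng)

# Among orthonormal 2-D projections, the PCA one maximises this index.
print("PCA index:   ", variance_index(X, W_pca))
print("random index:", variance_index(X, W_rand))
```

Swapping `variance_index` for a different scoring function is what changes one Projection Pursuit method into another; the search over orthonormal matrices `W` stays the same in spirit.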
Similar resources
Stein’s Identity, Fisher Information, and Projection Pursuit: A Triangulation
Two separate structure discovery properties of Fisher’s LDF are derived in a mixture multivariate normal setting. One of the properties is related to Fisher information and is proved by using Stein’s identity. The other property is on lack of unimodality. The properties are used to give three selection rules for choice of informative projections of high dimensional data, not necessarily multiva...
Ultra-Fast Image Reconstruction of Tomosynthesis Mammography Using GPU
Digital Breast Tomosynthesis (DBT) is a technology that creates three-dimensional (3D) images of breast tissue. Tomosynthesis mammography detects lesions that are not detectable with other imaging systems. If image reconstruction time is on the order of seconds, we can use tomosynthesis systems to perform tomosynthesis-guided interventional procedures. This research has been designed to study u...
The Grassmannian Atlas: A General Framework for Exploring Linear Projections of High-Dimensional Data
Linear projections are one of the most common approaches to visualize high-dimensional data. Since the space of possible projections is large, existing systems usually select a small set of interesting projections by ranking a large set of candidate projections based on a chosen quality measure. However, while highly ranked projections can be informative, some lower ranked ones could offer impo...
Active Learning for Informative Projection Retrieval
We introduce an active learning framework designed to train classification models which use informative projections. Our approach works with the obtained low-dimensional models in finding unlabeled data for annotation by experts. The advantage of our approach is that the labeling effort is expended mainly on samples which benefit models from the considered hypothesis class. This results in an im...
Learning Sparse Representations of High Dimensional Data on Large Scale Dictionaries
Learning sparse representations on data adaptive dictionaries is a state-of-the-art method for modeling data. But when the dictionary is large and the data dimension is high, it is a computationally challenging problem. We explore three aspects of the problem. First, we derive new, greatly improved screening tests that quickly identify codewords that are guaranteed to have zero weights. Second,...
Journal title: CoRR
Volume: abs/1511.08762
Publication date: 2015